Self reference in word definitions

Authors

  • David Levary
  • Jean-Pierre Eckmann
  • Elisha Moses
  • Tsvi Tlusty
Abstract

Dictionaries are inherently circular in nature. A given word is linked to a set of alternative words (the definition) which in turn point to further descendants. Iterating through definitions in this way, one typically finds that definitions loop back upon themselves. The graph formed by such definitional relations is our object of study. By eliminating those links which are not in loops, we arrive at a core subgraph of highly connected nodes. We observe that definitional loops are conveniently classified by length, with longer loops usually emerging from semantic misinterpretation. By breaking the long loops in the graph of the dictionary, we arrive at a set of disconnected clusters. We find that the words in these clusters constitute semantic units, and moreover tend to have been introduced into the English language at similar times, suggesting a possible mechanism for language evolution.

Introduction. – Words are the building blocks of language. By stringing together chains of these simple blocks, complex thoughts and ideas can be conveyed. For a language to be effective, this transmission must not only be precise but also efficient. Indeed, the continuous expansion of human languages tends to be driven less out of a need to express concepts that were previously uncommunicable, than by the constraint that concepts be transmitted rapidly. As a result of this need for efficient communication, the human lexicon is not a simple 1 to 1 mapping of concepts onto words, but rather a complex web of semantically related parts.

Network based formulations of human language have been employed previously to study language evolution. In this approach, words are considered to be the nodes of a graph with edges drawn based on a variety of possible relationships such as word co-occurrence in texts, thesauri, or word association experiments on human users [1, 2]. Such language networks tend to be scale-free and exhibit the small-world effect (i.e., nodes are separated from one another by a relatively small number of edges), characteristics shared by many other complex, empirically observed networks [2].

The notion of a dictionary based graph, in which directed links are drawn between a word and the words in its definition, was proposed early on in view of using computational tools [3]. Dictionaries provide an important tool for studying the relationship between words and concepts by linking a given word to a set of alternative words (the definition) which can express the same meaning. Of course, the given definition is not unique. One might just as well replace all of the words in the definition of the original word in question, with their respective definitions. In the graph of the dictionary then, a word and its set of descendants can be viewed as semantically equivalent.

Recently, the overall structure of this dictionary graph was analyzed [4]. It was found that dictionaries consist of a set of words, roughly 10% the size of the original dictionary, from which all other words can be defined. This subgraph was observed to be highly interconnected, with a central strongly connected component dubbed the core. The authors then studied the connection of this finding with the acquisition of language in children.

The existence of the core reflects an important property of the dictionary, namely its requirement that every word have a definition (i.e., a non-zero out-degree). The absence of “axiomatic” words whose definition is assumed results in a graph with a large number of loops, which is

arXiv:1103.2325v1 [cs.CL] 11 Mar 2011
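The step described in the abstract, eliminating the links that are not in loops, can be illustrated with a small sketch. It relies on a standard graph fact: a directed link lies on some loop exactly when both of its endpoints belong to the same strongly connected component. The toy dictionary and the function names below are assumptions made up for this example, and Tarjan's algorithm is used here only as one common way to find the components; the excerpt does not say how the authors computed them.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps a word to the words in its definition."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(node):
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        # `node` is the root of a component: pop the whole component off the stack.
        if lowlink[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    for node in sorted(nodes):
        if node not in index:
            visit(node)
    return components


def loop_edges(graph):
    """Keep only the links whose two endpoints share a component (links in loops)."""
    component_of = {}
    for i, component in enumerate(strongly_connected_components(graph)):
        for word in component:
            component_of[word] = i
    return {
        (word, target)
        for word, definition in graph.items()
        for target in definition
        if component_of[word] == component_of[target]
    }


# Hypothetical toy dictionary (not data from the paper).
toy_dictionary = {
    "big": ["large"],
    "large": ["big"],
    "giant": ["big", "person"],
    "person": ["human"],
    "human": ["person"],
}

core_links = loop_edges(toy_dictionary)
```

In this toy case the two links leaving "giant" are dropped, because nothing defines a word in terms of "giant", while the mutual definitions big/large and person/human survive. The recursive traversal is fine for a small example; a full dictionary would likely need an iterative version to avoid Python's recursion limit.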


Similar resources

Validity and reliability of self-reported diabetes in the Atherosclerosis Risk in Communities Study.

The objective of this study was to assess the validity of prevalent and incident self-reported diabetes compared with multiple reference definitions and to assess the reliability (repeatability) of a self-reported diagnosis of diabetes. Data from 10,321 participants in the Atherosclerosis Risk in Communities (ARIC) Study who attended visit 4 (1996-1998) were analyzed. Prevalent self-reported di...

Development of Word Definition Skill in Persian-speaking 54-90-Month-Olds

Objectives: Word definition skill is a complex language ability in which meta-linguistic awareness and literacy skills play a critical role. The present study examined the development of word definition skills in Persian-speaking children aged 4.5 to 7.5 years, concerning content and form aspects. Methods: This was a cross-sectional and analytic-descriptive study. The study subjects were 107 c...

Reflections on the meaning of clinician self-reference: are we speaking the same language?

Self-reference refers to clinician revelations about themselves. Theory and research on self-reference are limited by a lack of uniform conceptualizations. This paper discusses two types of self-reference, self-disclosure, and self-involving responses. Included are definitions of each type of self-reference; description of definitional inconsistencies in the literature; discussion of prevalence...

A Study and Comparison of the Development of the Content Aspect of Word Definition Skill in Persian-speaking Students Aged 7 to 12

Objective: Language has three components: content, form, and pragmatics. The content component covers semantics. Semantic knowledge of word relationships requires awareness of the relationships between different words in the same field and other fields. One of the main components of semantics is the mental lexicon that many of the semantic communications, including the organization and se...

The effects of levels of processing on retention of word meaning

The purpose of the study was to investigate the effects of the three encoding techniques of rote memory, semantic, and self-reference, on short-term and long-term retention levels of unfamiliar vocabulary words and their meanings. Seventy-two college students participated in the experiment, with 24 students in each encoding group. All participants viewed 20 target words a...

Journal:
  • CoRR

Volume: abs/1103.2325

Publication date: 2011